Abstract

We prove that a random 3-SAT instance with clause-to-variable 
density less than 3.52 is satisfiable with high probability. The proof 
comes through an algorithm which selects (and sets) a variable
depending on its degree and that of its complement.

1 Introduction 

There is much interest in understanding "phase transitions" in mathematics,
computer science, and mathematical physics, and in particular the $k$-SAT
phase transition. In the standard model for random $k$-SAT, a random
$k$-CNF formula $F(n, cn)$ with $n$ variables and density $c$ has $m = cn$ random
clauses independently selected uniformly at random, with replacement, from
among the $2^k \binom{n}{k}$ proper clauses of length $k$. The Satisfiability Threshold
Conjecture asserts that for each $k \ge 2$, there exists a constant $c_k$ such that
for all constants $c < c_k$, $F(n, cn)$ is a.a.s. (asymptotically almost surely)
satisfiable, while for $c > c_k$ it is a.a.s. unsatisfiable.
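
The random model above is easy to sample directly. The following is a minimal sketch, not taken from the paper: the function name `random_kcnf` and the signed-integer encoding of literals (`v` for a variable, `-v` for its negation) are assumptions made for illustration.

```python
import random

def random_kcnf(n, c, k=3, seed=None):
    """Sample F(n, cn): m = round(c * n) clauses, drawn independently and
    uniformly, with replacement, from the 2^k * C(n, k) proper k-clauses.

    Literal encoding (an assumption of this sketch): +v is variable v,
    -v is its negation, with variables numbered 1..n.
    """
    rng = random.Random(seed)
    m = round(c * n)
    formula = []
    for _ in range(m):
        # A proper clause uses k distinct variables ...
        variables = rng.sample(range(1, n + 1), k)
        # ... each negated independently with probability 1/2.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula
```

For example, `random_kcnf(1000, 3.52, seed=0)` returns 3520 clauses over 1000 variables; repeated clauses are possible because sampling is with replacement.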

The case of 2-SAT is well understood: Chvátal and Reed [CR92],
Goerdt [Goe96], and Fernandez de la Vega [FdlV92] independently proved
that $c_2 = 1$, and Bollobás, Borgs, Chayes, Kim, and Wilson [BBC+]
determined the "scaling window" to be $1 + \Theta(n^{-1/3})$. For $k > 2$, the conjecture
remains open. Friedgut proved that for any $k$ and $n$ there is a sharp threshold
$c_k(n)$, leaving open whether $c_k(n)$ has a limit $c_k$. With the threshold
behavior not understood, considerable attention has been devoted to proving
density bounds below which a formula is a.a.s. satisfiable ("lower bounds"
on the putative threshold) and bounds above which it is a.a.s. unsatisfiable
("upper bounds"). For $k = 3$, it is conjectured that $c_3 \approx 4.2$, and the best
upper bound is 4.596 [JSV00] (see footnote 1).

Existing lower bounds for 3-SAT are all algorithmically based. (By
contrast, new lower bounds for $k$-SAT are based on the second-moment method
[AM02, AP03].) The earliest such bound of 1.63, due to Broder, Frieze and
Upfal [BFU93], was based on the "pure-literal rule": successively setting to
True literals whose complement does not appear, "reducing" the formula,
and repeating. The next, 3.003, due to Frieze and Suen [FS96], used the
"shortest-clause rule", setting True a random literal from a random
shortest clause. Skipping over a bound of 3.145 by Achlioptas [Ach00], the
algorithm behind a bound of 3.14 due to Achlioptas and Sorkin [AS00] again
selects a literal from a shortest clause, but when the literal is from a
2-clause (the case of interest) it sets the literal either True or False
depending (optimally) on the number of other occurrences of the literal and
its negation in 2- and 3-clauses; [AS00] extends this to a version which
optimally chooses to set one or two literals at a time, and sets them
optimally, for a bound of 3.26. [AS00] suggests that better bounds may
require looking at literal-degree information, in some way harking back to
[BFU93]. This approach was taken up by Kaporis, Kirousis, and Lalas [KKL02],
whose algorithm sets a variable of largest degree (a "1-parameter heuristic")
to give a bound of 3.42. It was clear that the same approach could be
exploited further.

2 Result, significance, and open problems 

In this paper, we choose a variable according to its degree and that of its 
complement (a 2-parameter heuristic), to get a bound of 3.52. Kaporis 
and Lalas [KL], using a similar but not identical heuristic, independently 
obtained the same bound at around the same time. The purpose of this 
short abstract is twofold. 

First, since the bounds from all heuristics of this sort rely on numerical 
calculations (notably, solutions to differential equations), it is important to
put on record that our calculations and those of [KL] independently justify 
a value of 3.52, and that we reproduce the 3.42 bound of [KKL02]. 

Footnote 1: A bound of 4.506 due to Dubois, Boufkhad, and Mandler [DBM00]
has not appeared in journal-refereed form.

Second, our heuristic (and that of [KL]) efficiently solves denser random
instances than any other theoretically justified algorithm; since solving
3-SAT instances is of practical importance, our algorithm may be of practical
utility. In that regard, a few remarks. Our heuristic succeeds only with
a probability that is asymptotically bounded away from 0, but exploiting
a standard one-step backtracking trick brings the asymptotic probability
to 1. Also, the algorithms commonly used in practice are Davis-Putnam-type
backtracking procedures, quite different from the "greedy" approaches
taken in all the works described above. However, it is easy to imagine using
the present heuristic as a selection rule for a Davis-Putnam algorithm,
preserving the heuristic's theoretically justified behavior on random
instances, while gaining the Davis-Putnam algorithm's guarantee of a correct
answer on arbitrary instances.

Not present in this short abstract is a rigorous justification of our proof 
methodology (fairly easy and familiar), nor of the numerical calculations, 
which to be done rigorously would require theoretically derived Lipschitz 
bounds on a derivative, and interval-arithmetic calculations employing those 
bounds, along with a few other technicalities. 

Future work could include consideration of a variable's number of
appearances and that of its complement separately in 2-clauses and 3-clauses
(a 4-parameter heuristic), which is analyzable in the same framework. In
at least the 2- and 4-parameter versions of literal-degree heuristics (as
opposed to the 1-parameter version), it is not clear how best to select a next
literal: does a (2, 3) literal (2 positive appearances, 3 negative) trump a
(4, 5), or vice versa? In the 4-parameter version, it is also not clear how
best to set a chosen literal; this was the question answered in [AS00] for the
non-degree-spectrum case. An optimal solution to these questions would be
a most interesting theoretical contribution, and could also give significant
improvements in the bounds.

3 Algorithm 

We call a variable with $i$ positive and $j$ negative appearances an
$(i, j)$-variable. Our algorithm is defined as follows.

Algorithm A

Input: A 3-CNF formula.

begin
    while there exists an unset variable
        choose an (i, j)-variable v using a selection rule
        set v True if i > j and False otherwise
        while there exists a unit clause
            set a literal of an arbitrary unit clause True
    if an empty clause is generated report failure; otherwise report success
end

The best selection rule we found was this. If there is a "pure" variable
(one with $i = 0$ or $j = 0$), select it. Otherwise, choose a variable with
maximum discrepancy $|i - j|$, breaking ties in favor of maximal $i + j$. (This
identifies a unique unordered pair $\{i, j\}$, and all variables with those degrees
are indistinguishable to the algorithm.) This selection rule satisfies formulas
up to density 3.52. Other selection rules we tried were less good. Working
as above but breaking ties in favor of minimal $i + j$ only worked up to
density 3.50. Selecting by maximum $i/j$ instead of $i - j$ only worked up
to 3.44. Selecting by maximum $\max\{i, j\}$ is equivalent to the approach of
[KKL02], and we reproduce their 3.42.
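
Algorithm A with the maximum-discrepancy rule can be sketched in a few lines. This is an illustrative, unoptimized sketch rather than the authors' implementation: the function names, the signed-integer literal encoding, and the choice to set a variable True when $i \ge j$ (the direction consistent with satisfying a pure variable's appearances) are assumptions of the sketch.

```python
def degrees(clauses, unset):
    """Map each unset variable to [i, j]: positive and negative appearances."""
    deg = {v: [0, 0] for v in unset}
    for clause in clauses:
        for lit in clause:
            deg[abs(lit)][0 if lit > 0 else 1] += 1
    return deg

def assign(clauses, lit):
    """Set literal `lit` True: drop satisfied clauses, shorten the others."""
    return [tuple(x for x in c if x != -lit) for c in clauses if lit not in c]

def select(deg):
    """Pure variable if any; else maximum |i - j|, ties broken by maximum i + j."""
    for v, (i, j) in deg.items():
        if i == 0 or j == 0:
            return v
    return max(deg, key=lambda v: (abs(deg[v][0] - deg[v][1]), sum(deg[v])))

def algorithm_a(clauses, n):
    """Return an assignment {variable: bool}, or None if an empty clause arises."""
    clauses = [tuple(c) for c in clauses]
    unset = set(range(1, n + 1))
    value = {}
    while unset:
        deg = degrees(clauses, unset)
        v = select(deg)
        i, j = deg[v]
        lit = v if i >= j else -v          # assumed direction: favor the majority sign
        clauses = assign(clauses, lit)     # the "free move"
        value[v] = lit > 0
        unset.discard(v)
        while True:                        # the "forced moves" (unit clauses)
            if () in clauses:
                return None                # empty clause generated: report failure
            unit = next((c for c in clauses if len(c) == 1), None)
            if unit is None:
                break
            u = unit[0]
            clauses = assign(clauses, u)
            value[abs(u)] = u > 0
            unset.discard(abs(u))
    return value
```

The sketch recomputes all degrees at every step, so it runs in quadratic time; an implementation meant for large instances would maintain the degree counts incrementally.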

4 Analysis 

In truth, the "natural" algorithm above is not the one analyzed. Rather than
making $O(n)$ iterations, the analyzed algorithm makes a constant number
of iterations, in each of which it sets $\Theta(n)$ variables with common degree
$(i, j)$.

It is easily verified that during the algorithm, the formula remains
uniformly random conditioned on its degree sequence. To make the calculations
finite, we truncate the degree sequence at some value $h$ ($h = 31$ in our
calculations). Then with $n$ the original number of variables, for $i, j < h$ we let $n_{i,j}$
be $1/n$th the current number of variables of degree $(i, j)$; $n_{h,j}$ (and $n_{i,h}$) the
similar value for variables with $\ge h$ positive (negative) appearances; and
$n_{h,h}$ that for variables with $\ge h$ positive and negative appearances.
Setting a single $(i, j)$-variable produces straightforwardly computable expected
changes $\Delta$ (detailed in Appendix A) to the $h^2$-dimensional vector $S$ of
values $n_{i,j}$, and each element of $\Delta$ has order only $O(1/n)$, so the differential
equation method (see for example [Wor95]) can be used to prove that, as we
set $\Theta(n)$ variables with common degree $(i, j)$, the vector $n_{i,j}$ almost surely
almost exactly follows a trajectory described by the solution of a differential
equation corresponding to $\Delta$.
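
The truncated degree table is straightforward to compute from a concrete formula. The sketch below is illustrative only: the name `degree_table`, the dictionary representation, and the signed-integer literal encoding are assumptions, and it treats every variable as still unset.

```python
from collections import Counter

def degree_table(clauses, n, h):
    """Return {(i, j): fraction}, where fraction is the number of variables
    with i positive and j negative appearances, divided by n. Degrees of h
    or more are lumped into the boundary cells (index h)."""
    pos, neg = Counter(), Counter()
    for clause in clauses:
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    table = Counter()
    for v in range(1, n + 1):
        table[(min(pos[v], h), min(neg[v], h))] += 1
    return {cell: count / n for cell, count in table.items()}
```

Cells that hold no variables are simply absent from the returned dictionary, so callers can read them with `table.get(cell, 0.0)`.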

So instead of selecting an $(i, j)$-variable as in Algorithm A, we use the
same selection rule to select a pair $(i, j)$, $i, j \le h$, and we set $n \cdot \min\{\delta, n_{i,j}\}$
$(i, j)$-variables at once. Here $\delta$ is a value of our choosing, which could vary
from round to round, but which we fixed at $10^{-6}$. Each such round
(including the unit-clause steps it implies) can be described by the differential
equation method, and our analysis simply consists of simulating the
differential equations for a constant number of rounds. It is clear that after some
number of rounds, all the values in $S$ can be made arbitrarily small, and at
that point we apply the main theorem of Cooper, Frieze, and Sorkin [CFS02]
to show that the remaining formula is satisfiable a.a.s. (The positive side
of their result has a natural algorithmic interpretation, so our procedure
remains algorithmic to the end.)

5 Differential Equations 

In this section, we describe the differential equations for the case in which
we have a 2-dimensional table for keeping the expected number of variables
with $k < h$ positive appearances and $l < h$ negative appearances. Since
$dt$ (the "time parameter" described before) is very small, w.l.o.g. we can
assume all values of $S$ remain fixed during a round. Using these values, we
obtain the new value of $S$ after a round. Suppose we set a variable from cell
$n_{k,l}$ True (in other words, set a $(k, l)$-degree variable True). Then writing $\delta$
to denote the expected increase to a parameter, for such a "free move" we
have:
where $s_m = s_n = 2 m_2 + 3 m_3$ is the total density of appearances of all
variables and $m_1$ is the expected number of unit clauses generated by this
free move. Here $\psi(x, y) = 1$ if $x = y$ and zero otherwise. Note also that
since by definition $m_1 = 0$ at the start of a round, at the end, $m_1 = \delta m_1$ as
given above.

After a free move, we have a number of "forced moves" in which the
literals in all $m_1$ unit clauses must be set True to satisfy our formula. A
literal in a unit clause is a variable from cell $(k'+1, l')$ with probability
$\frac{(k'+1)\, n_{k'+1,l'}}{s_n}$, or the negation of a variable from cell $(k', l'+1)$ with probability
$\frac{(l'+1)\, n_{k',l'+1}}{s_n}$. In either case, in the rest of the formula (excepting the unit
clauses) that variable has degree $(k', l')$. Thus the expected number $p$ of new
unit clauses produced by one such forced move (the Malthus parameter in
our Galton-Watson process) is

\[
p = \sum_{0 \le k', l' < h} \left( \frac{(k'+1)\, n_{k'+1,l'}}{s_n} \cdot \frac{l' \cdot 2 m_2}{s_m}
  + \frac{(l'+1)\, n_{k',l'+1}}{s_n} \cdot \frac{k' \cdot 2 m_2}{s_m} \right),
\]

where $l' \cdot \frac{2 m_2}{s_m}$ (see parameter $m_1$ defined above) is the density of new unit
clauses after setting a $(k', l')$-degree variable True (which happens with
probability $\frac{(k'+1)\, n_{k'+1,l'}}{s_n}$) and $k' \cdot \frac{2 m_2}{s_m}$ is the density of new unit clauses after setting
a $(k', l')$-degree variable False (which happens with probability $\frac{(l'+1)\, n_{k',l'+1}}{s_n}$).
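
The Malthus parameter can be evaluated numerically from the degree table. The sketch below is hedged: it assumes the expression for $p$ reads as a sum over $0 \le k', l' < h$ of the probability of each kind of forced literal times the density of unit clauses it creates, with $s_m = s_n = 2 m_2 + 3 m_3$. The function name and the dictionary representation of the table are illustrative assumptions.

```python
def malthus_parameter(table, m2, m3, h):
    """Expected number p of new unit clauses per forced move.

    table maps (i, j) to n_{i,j}; m2 and m3 are the densities of 2- and
    3-clauses. Assumed reading of the formula:
      p = sum over 0 <= k', l' < h of
            (k'+1) n[k'+1, l'] / s_n * l' * 2 m2 / s_m
          + (l'+1) n[k', l'+1] / s_n * k' * 2 m2 / s_m
    """
    s = 2 * m2 + 3 * m3            # s_m = s_n: density of literal appearances
    if s == 0:
        return 0.0                 # no clauses left, so no forced moves
    in_two_clause = 2 * m2 / s     # chance a given appearance sits in a 2-clause
    p = 0.0
    for k in range(h):
        for l in range(h):
            # Forced literal is positive: the variable is set True and its
            # l negative appearances may shrink 2-clauses to unit clauses.
            prob_true = (k + 1) * table.get((k + 1, l), 0.0) / s
            # Forced literal is negative: the variable is set False and its
            # k positive appearances may do the same.
            prob_false = (l + 1) * table.get((k, l + 1), 0.0) / s
            p += prob_true * l * in_two_clause + prob_false * k * in_two_clause
    return p
```

The branching process of forced moves dies out only when $p < 1$, so a simulation of the rounds would check this value at every step.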

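For such a forced move, the expected parameter changes are:

\[
\delta' m_3 = \sum_{0 \le k', l' < h} \left( \frac{(k'+1)\, n_{k'+1,l'}}{s_n}\, \delta M_3(k', l', T)
  + \frac{(l'+1)\, n_{k',l'+1}}{s_n}\, \delta M_3(k', l', F) \right),
\]

\[
\delta' m_2 = \sum_{0 \le k', l' < h} \left( \frac{(k'+1)\, n_{k'+1,l'}}{s_n}\, \delta M_2(k', l', T)
  + \frac{(l'+1)\, n_{k',l'+1}}{s_n}\, \delta M_2(k', l', F) \right),
\]

where $\delta M_3(k', l', T)$ has exactly the same formula as $\delta m_3$ defined above,
likewise for $\delta M_2(k', l', T)$ and $\delta m_2$, and where by symmetry $\delta M_3(k', l', F) =
\delta M_3(l', k', T)$ and $\delta M_2(k', l', F) = \delta M_2(l', k', T)$.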
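Finally, for each $i$ and $j$,

\[
\delta' n_{i,j} = \sum_{0 \le k', l' < h} \Big( \frac{(k'+1)\, n_{k'+1,l'}}{s_n}
    \big( \delta N_{i,j}(k', l', T) - \psi(i, k'+1) \cdot \psi(j, l') \big)
  + \frac{(l'+1)\, n_{k',l'+1}}{s_n}
    \big( \delta N_{i,j}(k', l', F) - \psi(i, k') \cdot \psi(j, l'+1) \big) \Big),
\]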
[...] $(i + j) \cdot n_{i,j}$, $(i + 1) \cdot n_{i+1,j}$, $(j + 1) \cdot n_{i,j+1}$ [...] $s_n$ [...], and

\[
\delta N_{i,j}(k, l, F) = \delta N_{i,j}(l, k, T).
\]

We note that the formula for $\delta' n_{i,j}$ can be obtained by considering the flow
which goes in or out for cell $n_{i,j}$.



Reasoning via the Galton-Watson process, we know that the expected
number of forced moves is $\frac{m_1}{1-p}$. Thus the new expected value of $S$ after
setting a small fraction $dt$ of variables from cell $n_{k,l}$ True is:
$S + dt\,(\delta m_2, \delta m_3, \delta n) + dt\,\frac{m_1}{1-p}\,(\delta' m_2, \delta' m_3, \delta' n)$.
If we set a variable from cell $n_{k,l}$
False, the expected changes can be obtained by just swapping the roles of $k$
and $l$ in the above description.
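
The Galton-Watson count of forced moves can be written out as a geometric series. As a sketch, assume that the free move leaves $m_1$ unit clauses (generation 0) and that each forced move spawns on average $p < 1$ new unit clauses, so that generation $t$ has expected size $m_1 p^t$:

```latex
\mathbb{E}[\text{number of forced moves}]
  = \sum_{t \ge 0} m_1 \, p^{t}
  = \frac{m_1}{1 - p},
  \qquad 0 \le p < 1 .
```

The sum diverges as $p \to 1$, which is why the analysis needs the Malthus parameter to stay below 1 throughout the rounds.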